# MASpi: A Unified Environment for Evaluating Prompt Injection Robustness in LLM-Based Multi-Agent Systems

[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)

> **Note**: This is a supplementary version of MASpi. The complete version will be released after the review process is completed.

MASpi is a benchmarking framework designed to evaluate the robustness of Large Language Model-based Multi-Agent Systems (LLM-MAS) against various prompt injection attacks. It provides a unified environment for testing, comparing, and improving the security of LLM-MAS.

## 🌟 Features

- **Multi-Agent System Support**: Compatible with popular MAS frameworks including AutoGen, MetaGPT, AgentVerse, CAMEL, and MAD
- **Comprehensive Attack Suite**: Implements various attack strategies including hijacking, disruption, and exfiltration attacks
- **Defense Mechanisms**: Includes safety filters and BERT-based detectors for evaluating defense strategies
- **Multiple Task Domains**: Supports code generation, mathematical reasoning, and question-answering tasks
- **Flexible Configuration**: Easy-to-configure evaluation parameters and model settings
- **Extensible Architecture**: Modular design allows for easy addition of new attacks, defenses, and MAS frameworks

## 🏗️ Architecture

### Core Components

- **Multi-Agent Systems (MAS)**: Implementations of various MAS frameworks
- **Attack Modules**: Different types of prompt injection attacks
- **Evaluation Suite**: Task-specific evaluation frameworks
- **Task Executors**: Handlers for different task domains (code, math, QA)

### Supported Multi-Agent Systems

- **AutoGen**: Microsoft's multi-agent conversation framework
- **MetaGPT**: Multi-agent system for software development
- **AgentVerse**: Versatile multi-agent platform
- **CAMEL**: Communicative Agents for Mind Exploration
- **MAD**: Multi-Agent Debate system

### Attack Types

- **Hijacking Attack**: Attempts to take control of agent behavior
- **Disruption Attack**: Aims to disrupt normal agent operations
- **Exfiltration Attack**: Tries to extract sensitive information

## 🚀 Quickstart

### Prerequisites

- Python 3.10+
- Conda (recommended) or pip

### Installation

```bash
# Step 1: Create and activate the environment
conda create -n maspi python=3.10
conda activate maspi

# Step 2: Clone the repository
git clone https://github.com/[Author's github account]/MASpi.git
cd MASpi

# Step 3: Install dependencies
pip install -e .
```

### Basic Usage

```bash
# run with custom parameters
python benchmark.py \
  --mas autogen \
  --suite hijacking \
  --defense safety_filter \
  --task_domain code \
  --max_workers 8
```

## 📋 Configuration

### Model Configuration

Edit `configs/model.yaml` to configure your LLM:

```yaml
provider: openai
api_key: your-api-key
base_url: https://api.openai.com/v1
model_name: gpt-4o
temperature: 0.0
max_tokens: 1024
```

### Judge Configuration

Edit `configs/judge.yaml` for evaluation settings:

```yaml
provider: openai
api_key: your-api-key
base_url: https://api.openai.com/v1
model_name: gpt-4o-mini
temperature: 0.0
max_tokens: 1024
```

## 🎯 Usage Examples

### Running Different Attack Suites

```bash
# Hijacking attack on AutoGen
python benchmark.py --mas autogen --suite hijacking --task_domain code

# Disruption attack on MetaGPT
python benchmark.py --mas metagpt --suite disruption --task_domain math

# Exfiltration attack on CAMEL
python benchmark.py --mas camel --suite exfiltration --task_domain qa
```

### Testing Defense Mechanisms

```bash
# Test with safety filter
python benchmark.py --mas autogen --suite hijacking --defense safety_filter

# Test with BERT detector
python benchmark.py --mas autogen --suite hijacking --defense bert_detector
```


## 🔧 Advanced Configuration

### Custom Attack Modes

```bash
# Continuous attack mode
python benchmark.py --attack_mode continuous

# Single-shot attack mode
python benchmark.py --attack_mode single
```

### Targeting Specific Agents

```bash
# Target specific agents in the multi-agent system
python benchmark.py --malicious_agents agent_1 agent_2
```

### Parallel Processing

```bash
# Use multiple workers for faster evaluation
python benchmark.py --max_workers 16
```

## 📁 Project Structure

```
maspi/
├── agent_components/     # Base agent components and LLMs
├── attacks/             # Attack implementations
├── defenses/            # Defense mechanisms
├── evaluation/          # Evaluation framework and datasets
├── mas/                # Multi-agent system implementations
│   ├── autogen/        # AutoGen implementation
│   ├── metagpt/        # MetaGPT implementation
│   ├── agentverse/     # AgentVerse implementation
│   ├── camel/          # CAMEL implementation
│   └── mad/            # MAD implementation
└── utils/              # Utility functions and tools
```

## 🤝 Contributing

We welcome contributions! Please feel free to submit issues, feature requests, or pull requests.

### Adding New Multi-Agent Systems

1. Create a new directory in `maspi/mas/`
2. Implement the required interface from `BaseMAS`
3. Add agent implementations in the `agents/` subdirectory
4. Update the factory pattern in `maspi/utils/factory.py`

### Adding New Attacks

1. Create a new attack class in `maspi/attacks/`
2. Inherit from `BaseAttack`
3. Implement the attack logic
4. Update the attack registry

### Adding New Defenses

1. Create a new defense class in `maspi/defenses/` (Optional)
2. Implement the defense logic in `maspi/evaluation/task_executor.py`

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 📚 Citation

If you use MASpi in your research, please cite our paper:

```bibtex
@article{maspi2024,
  title={MASpi: A Unified Environment for Evaluating Prompt Injection Robustness in LLM-Based Multi-Agent Systems},
  author={Your Name and Co-authors},
  journal={Conference/Journal Name},
  year={2024}
}
```

## 🙏 Acknowledgments

Special thanks to [MASLab](https://github.com/MASWorks/MASLab) for their pioneering work in multi-agent systems research and foundational contributions.

---

For more information, please visit our [documentation (Coming Soon)](docs/) or open an issue on GitHub.